Online Control With Least-Squares Methods

Abstract

Least-squares techniques for policy evaluation (such as LSTD and iLSTD) have been shown to estimate the value of a policy with far less data than traditional TD techniques. Unfortunately, they rely on policy-dependent statistics that have to be discarded when the policy changes, which makes the techniques difficult to use for online control problems. In this paper, we explore the effect of policy on the least-squares statistics, distinguishing three fundamental effects. We then introduce the framework of least-squares Sarsa (LSS and iLSS) and empirically evaluate previously suggested approaches for handling data from older policies in the least-squares statistics. We show that these approaches can maintain the data efficiency of least-squares methods in some control problems, and we identify circumstances where least-squares approaches can be problematic and where special handling of data from older policies improves learning.

Similar resources

Combined Estimation and Optimal Control of Batch Membrane Processes

In this paper, we deal with the model-based time-optimal operation of a batch diafiltration process in the presence of membrane fouling. Membrane fouling poses one of the major problems in the field of membrane processes. We model the fouling behavior and estimate its parameters using various methods. Least-squares, least-squares with a moving horizon, recursive least-squares methods and the ex...

Online Controller Tuning via FRIT and Recursive Least-Squares

This paper proposes an online type of controller parameter tuning method by modifying the standard fictitious reference iterative tuning method and by utilizing the so-called recursive least-squares (RLS) algorithm, which can cope with variation of plant characteristics adaptively. As used in many applications, the RLS algorithm with a forgetting factor is also applied to give more weight to mo...

Online Tuning Strategy for Multi-loop SISO PI Control Algorithms in Multivariable Interactive Systems

Tuning of PI control algorithms for coupled multi-input multi-output (MIMO) systems is a challenging problem. This paper extends a previously developed model-based adaptive tuning method to handle the tuning problem of coupled multivariable systems. The performance of the proposed method is compared to those of existing methods such as Biggest Log Modulus (BLT), Sequential Loop Closing (SLC) an...

Simultaneous Model Predictive Control and Identification: Closed-Loop Properties

Model Predictive Control and Identification is an adaptive control technique that solves an online optimization problem to find process inputs for the dual control problem. Its main goal is to bring robustness to Model Predictive Control and to increase its capability to handle uncertainties and time-varying parameters in the processes. Theoretical properties, such as feasibility of the optimization pr...

Least-squares methods for policy iteration

Approximate reinforcement learning deals with the essential problem of applying reinforcement learning in large and continuous state-action spaces, by using function approximators to represent the solution. This chapter reviews least-squares methods for policy iteration, an important class of algorithms for approximate reinforcement learning. We discuss three techniques for solving the core, po...

Publication date: 2007